Load Balancing with PCCIS

In the PCC architecture, PCCIS servers are designed to be stand-alone. Each PCCIS server is unaware of any other PCCIS servers processing content, even for the same application. All content caching and processing happens in isolation within a single server and is not accessible to other servers. This design favors simplicity by avoiding the setup and maintenance effort associated with shared state, but at the cost of making each individual PCCIS server stateful.

While a viewing session is active, the PCC client viewer continually requests information from PCCIS in order to display the desired content. All requests for a viewing session need to be directed to the same PCCIS server that created it. This is important if you intend to use a bank of two or more PCCIS servers behind a single end-point (e.g. a load balancer) to support your viewing application.

There are two ways that we recommend to maintain the relationship between a specific client and back-end PCCIS server. The two methods, which we’ll refer to as Client Affinity and Proxy Based routing, respectively, are discussed in more detail below.

Client Affinity Based Routing

In this method, a load balancer sitting between your PCC viewer application and the PCCIS server bank is independently responsible for maintaining the relationship between the two end points. An important distinction between this type of routing and Proxy Based Routing is that in this type, the relationship is maintained without any specific knowledge of how PCC works. A load balancer typically accomplishes this by creating a unique ID to establish a mapping between a specific client and a backend server. The load balancer uses its internal load balancing algorithm to select a PCCIS server when it receives the first request from a new client. Then, using an HTTP cookie, it instructs the client to include that ID with any future requests so that they will be routed to the same PCCIS server.
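
The cookie-based mapping described above can be illustrated with a small sketch. This is a toy model only, not how any particular load balancing product is implemented: the class name `StickyBalancer`, the round-robin selection, and the server names are all hypothetical and chosen for illustration.

```python
import itertools
import uuid


class StickyBalancer:
    """Toy model of cookie-based client affinity (not a real load balancer)."""

    def __init__(self, servers):
        # Round-robin stands in for whatever algorithm the load balancer uses.
        self._next_server = itertools.cycle(servers)
        self._affinity = {}  # affinity ID -> backend server

    def route(self, affinity_id=None):
        """Return (server, affinity_id) for one request.

        affinity_id is the value of the load balancer's cookie, or None when
        the client is new or has deleted its cookies.
        """
        if affinity_id in self._affinity:
            # Known client: keep sending it to the same backend.
            return self._affinity[affinity_id], affinity_id
        # New client: pick a server and mint an ID for the client to send back.
        new_id = uuid.uuid4().hex
        server = next(self._next_server)
        self._affinity[new_id] = server
        return server, new_id
```

A client that presents the ID it was given keeps reaching the same server, while a client without an ID is treated as new. Note that nothing in this mapping depends on PCC itself.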

This routing method is generally the easiest to set up, primarily because most off-the-shelf load balancing products, such as Amazon Web Services’ Elastic Load Balancing and Microsoft’s Application Request Routing, include this functionality as a built-in feature.

However, because the load balancer independently maintains a relationship between a specific client (e.g. a single user) and a PCCIS server, a client has a tendency to get "stuck" to a single PCCIS server for an extended period of time. This period of time varies depending on the implementation, but is likely to last until the cookie containing the load balancer mapping ID expires or until the user deletes their browser cookies. This "stickiness" decreases the effectiveness of the load balancing algorithm, though not enough to become a problem when request load is considered across the entire system.

Proxy Based Routing

The Proxy Based routing method is so named because it requires a proxy server that sits in front of the PCCIS server bank and exposes the PCCIS end-points to your PCC viewer application. Like the Client Affinity method, this proxy server is responsible for applying a load balancing algorithm to distribute traffic evenly throughout the bank of PCCIS servers behind it. But unlike the Client Affinity method, the proxy method uses a bit of PCC-specific information already available in the requests to route them to the correct PCCIS server.

This method offers the advantage of "sticking" clients to a specific PCCIS server for a much shorter period of time than the Client Affinity method does. Each time a client causes a new viewing session to be created in your PCC viewer application, the new viewing session will be created on a PCCIS server selected by the load balancing algorithm. This allows load balancing to be more effective at distributing load across the PCCIS servers than the Client Affinity method.

Proxy Based routing will require some development and IT effort to implement in your application. If interested, please continue reading below for more details on how this routing method works.

How It Works

Proxy Based routing works by making use of an ID that is associated with every viewing session PCCIS creates. Specifically, this viewing session ID is an encrypted and Base64-encoded value that encapsulates real data, part of which is the hostname of the PCCIS server that created it. We’ll get back to the format of the viewing session ID, but first a quick mention of when the ID is created and how it can be retrieved from the requests sent to PCCIS.

Locating the Viewing Session ID

The viewing session ID is first created during the initial HTTP POST request to create a new viewing session. The viewing session ID is returned in the associated response body as a JSON property named "viewingSessionId".

Example

POST /PCCIS/V1/ViewingSession
{
  "tenantId": "My Application Name",
  "externalId": "MyDocument.pdf",
  "render": {
    "flash": {
      "optimizationLevel": 1
    },
    "html5": {
      "alwaysUseRaster": false
    }
  }
}

200 OK
{
  "viewingSessionId": "pMaHhbPi"
}
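
As a minimal sketch, the web tier could read the ID out of the response body as follows. The function name `read_viewing_session_id` is hypothetical; only the "viewingSessionId" property name comes from the response shown above.

```python
import json


def read_viewing_session_id(response_body):
    """Extract the viewing session ID from the create-session response body."""
    payload = json.loads(response_body)
    # "viewingSessionId" is the JSON property named in the response above.
    return payload["viewingSessionId"]
```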

Every subsequent request to PCCIS in the context of this viewing session must include the viewing session ID. The proxy must intercept these HTTP requests and extract the viewing session ID from each one in order to get the hostname of the PCCIS server the request is intended for. Below are a few valid requests to PCCIS that demonstrate the various locations of the viewing session ID.

Example

PUT /ViewingSession/q/SourceFile?ViewingSessionId=upMaHhbPi&FileExtension=pdf
{
  [document data]
}

200 OK

GET /PCCIS/V1/ViewingSession/upMaHhbPi/Notification/SessionStarted

200 OK

GET /PCCIS/V1/Page/q/0?DocumentID=upMaHhbPi&Scale=1&ContentType=svg

200 OK
{
  <?xml>
  <svg>
  …
  </svg>
}

The PUT request and first GET request shown above are HTTP requests sent to PCCIS from your application’s web tier. For these requests, you have control over the format and can choose either one (viewing session ID in the query string or in the URL path) for web-tier originated requests. Once you select your preferred request format, it is recommended that you use it consistently to make it easier to find the viewing session ID in your proxy implementation.

The final GET request shown above is an HTTP request generated from the PCC client viewer. You do not have control over this format, so the proxy must also be able to locate the viewing session ID in the "DocumentID" query parameter. The PCC viewer will consistently use this query parameter to pass the viewing session ID.

A final note about the examples above concerns the "u" character that precedes the viewing session ID in each case. This character is not actually part of the viewing session ID, but a prefix that is used internally by PCCIS and the PCC viewer. You should remove this prefix character before attempting to decode the viewing session ID.
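
The three locations shown in the examples above could be handled by a proxy along the following lines. This is a sketch under assumptions: the function names are hypothetical, parameter names are matched case-insensitively as a precaution, and the "q" path segment is treated as a placeholder rather than an ID because that is how it appears in the example requests.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Query parameters that carry the viewing session ID in the example requests.
_QUERY_KEYS = ("viewingsessionid", "documentid")
# Path form from the examples: .../ViewingSession/{id}/...
_PATH_RE = re.compile(r"/ViewingSession/([^/?]+)/", re.IGNORECASE)


def _strip_prefix(value):
    # Drop the internal "u" prefix, which is not part of the ID itself.
    return value[1:] if value.startswith("u") else value


def find_viewing_session_id(request_target):
    """Return the viewing session ID without its prefix, or None if absent."""
    parts = urlsplit(request_target)
    query = {k.lower(): v for k, v in parse_qs(parts.query).items()}
    for key in _QUERY_KEYS:
        if key in query:
            return _strip_prefix(query[key][0])
    match = _PATH_RE.search(parts.path)
    # Assumption: "q" is a placeholder segment, never a session ID.
    if match and match.group(1) != "q":
        return _strip_prefix(match.group(1))
    return None
```

Checking the query string first means the PUT and viewer-generated GET formats are resolved before the path is inspected, so the "q" placeholder is never mistaken for an ID.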

Decoding & Decrypting the Viewing Session ID

Once located, the viewing session ID must be decoded, then decrypted to obtain the hostname of the PCCIS server in plain text.

To decode the viewing session ID, use a Base64 decoder. PCCIS uses a URL-safe Base64 encoding so that the viewing session ID can be transmitted safely in HTTP query parameters. A URL-safe Base64 alphabet is specified in RFC 4648, section 5. The Apache Commons Codec library is a good example of a library that includes a Base64 class with a URL-safe decoding option.
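
For illustration, the decoding step might look like the following sketch using Python's standard library. It assumes the common URL-safe alphabet ("-" and "_" in place of "+" and "/") and that trailing "=" padding may have been omitted; neither detail is confirmed by this topic, and the function name is hypothetical.

```python
import base64


def decode_viewing_session_id(session_id):
    """Decode a URL-safe Base64 viewing session ID into raw bytes."""
    # URL-safe encoders often omit the trailing "=" padding; restore it so the
    # length is a multiple of four, as the decoder requires.
    padded = session_id + "=" * (-len(session_id) % 4)
    return base64.urlsafe_b64decode(padded)
```

The returned bytes are still encrypted at this point and are the input to the decryption step.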

The output from the Base64 decoder should be a byte array. The byte array contains AES-encrypted data that must be decrypted using a library that supports 128-bit AES decryption. The encryption key and iv (initialization vector) that PCCIS used to encrypt the data when it created the viewing session ID can be found in the PCCIS configuration file named "pcc.config". For more details about these configuration values, please see the pcc.config topic. It is recommended that you change the key and iv values in the pcc.config file from their default values for maximum security.

The output from the AES decryption phase should be a plain-text string in the following format:

Example

"{internal viewing session GUID}/{PCCIS server hostname}/{Auth-Token header value}"

For example:

Example

"d6fea7fc-57d9-4f6c-83bd-839e2aeeeb5b/foo.docserver.001/accusoft"

This string value can be split on the slash character, "/", to produce the following three values:

An internal GUID, which is used for internal purposes only and can be discarded.
The PCCIS server hostname, which is the value that your proxy can use to forward on requests to the appropriate PCCIS server.
The value of the "Auth-Token" header, which may be included in the initial POST request to create a viewing session. If this header is not present, this value will be "accusoft" by default. You can include this header with any value you choose when you create new viewing sessions. This value could be useful for additional authentication of requests in your proxy, for example.
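
Splitting the decrypted string into these three values can be sketched as below. The names `SessionInfo` and `parse_session_info` are hypothetical. Limiting the split to two separators is an assumption made so that an Auth-Token value containing a slash would stay intact; this topic does not say whether such values are allowed.

```python
from typing import NamedTuple


class SessionInfo(NamedTuple):
    internal_guid: str  # internal use only; can be discarded
    hostname: str       # PCCIS server that created the viewing session
    auth_token: str     # Auth-Token header value, "accusoft" by default


def parse_session_info(plain_text):
    """Split the decrypted viewing session ID into its three parts."""
    # maxsplit=2 keeps any further "/" inside the token (an assumption).
    parts = plain_text.split("/", 2)
    if len(parts) != 3 or not parts[1]:
        raise ValueError("unexpected viewing session ID payload")
    return SessionInfo(*parts)
```

With the example string above, `parse_session_info(...).hostname` yields "foo.docserver.001".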

Now that you have the hostname of the PCCIS server, your proxy can send the request on to the correct PCCIS server, wait for the response, then return a copy of the response back to the client that originally sent the request.
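
Putting the routing decision together, a proxy might choose a back-end server roughly as in the sketch below. Everything named here is hypothetical: `ProxyRouter`, `choose_server`, and the `hostname_for_request` callable, which stands in for the locate, decode, and decrypt steps described in this topic. Round-robin is used only as a placeholder for your load balancing algorithm.

```python
import itertools


class ProxyRouter:
    """Sketch of the routing decision only; forwarding is left out."""

    def __init__(self, servers, hostname_for_request):
        self._servers = set(servers)
        self._round_robin = itertools.cycle(servers)
        # Callable that returns the PCCIS hostname embedded in a request's
        # viewing session ID (locate, Base64-decode, decrypt, split).
        self._hostname_for_request = hostname_for_request

    def choose_server(self, method, request_target):
        path = request_target.split("?", 1)[0].rstrip("/")
        if method == "POST" and path.lower().endswith("/viewingsession"):
            # New viewing session: any server will do, so load balance.
            return next(self._round_robin)
        # Existing session: must go to the server that created it.
        hostname = self._hostname_for_request(request_target)
        if hostname not in self._servers:
            raise LookupError("unknown PCCIS server: %r" % (hostname,))
        return hostname
```

Rejecting hostnames that are not in the configured server bank is a design choice of this sketch; it keeps the proxy from forwarding traffic to an arbitrary host if a viewing session ID is stale or malformed.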